Abstract

We analyse the so-called small-world network model (originally devised
by Strogatz and Watts), treating it, among other things, as a case study of
non-linear coupled difference or differential equations. We derive a system
of evolution equations containing more of the previously neglected (possibly
relevant) non-linear terms. As an exact solution of this entangled system of
equations is out of the question, we develop a (as we think, promising) method
of enclosing the "exact" solutions for the expected quantities by upper and
lower bounds, which represent solutions of a slightly simpler system of
differential equations. Furthermore, we discuss the relation between difference and
differential equations and scrutinize the limits of the spreading idea for
random graphs. We then show that there exists in fact a "broad" (with respect
to scaling exponents) crossover zone, smoothly interpolating between linear
and logarithmic scaling of the diameter or average distance. We are able to
corroborate earlier findings in certain regions of phase or parameter space
(e.g. the finite size scaling ansatz) but also find deviations for other choices
of the parameters. Our analysis is supplemented by a variety of numerical
calculations, which, among other things, quantify the effect of the various
approximations being made. With the help of our analytical results we manage
to calculate another important network characteristic, the (fractal) dimension,
and provide numerical values for the case of the small-world network.

Catchwords: Small-World Networks, Non-linear Difference Equations



1 Introduction 



As part of a broader interest in complex systems, the analysis of large networks of
interacting agents, or simply of certain degrees of freedom, is currently under intense
study. Recently a presumably far-reaching core concept came to the fore, called the
small-world effect (for an incomplete list of references see, for example, [..] to [..]). To
put it briefly, the presence of a surprisingly small number of random edges, inserted
into an initially quite regular graph, may have drastic effects on the average distance
between nodes or the expected diameter of the network. These additional random
edges, called shortcuts, may typically connect regions which were quite far
apart in the original regular graph, thus effecting a drastic shrinkage of
average distance or diameter in certain regions of parameter space.

It is perhaps noteworthy that we detected a similar phenomenon in quite a 
different area of modern physics (quantum space-time physics) at almost the same 
time, being completely unaware of similar findings in other fields of natural science. 
We called this phenomenon a microscopic wormhole structure.

To understand this small-world effect in more quantitative terms, a simple model,
the so-called Strogatz-Watts model, was investigated in more detail in [..], [..], [..]
and a little later also in [14].

In its most tractable form it consists of N linearly ordered vertices (nodes) with
periodic boundary condition (i.e. node $x_N$ is linked to node $x_1$). In general each
node may also be linked to its regular neighbors up to order k. The generic case is
already given for k = 1 (i.e. nearest neighbors only, or $\mathbb{Z}_N$).

To mimic the random rewiring of edges of the original Strogatz-Watts model, it
is convenient to superimpose on the given regular graph a random graph living over
the same set of vertices. We prefer to introduce the so-called edge probability p,
that is, the independent probability for the existence of an edge between two nodes,
as is usually done in the random graph framework (see e.g. [15], [16]). Some authors
(for reasons which derive from the original idea of rewiring existing links)
made a different choice, referring the probability of a random edge (or shortcut) to
the number of nodes, N, in the graph (for the case k = 1!). The relation between
these two probabilities, p and $\phi$, is described at the beginning of the following section.

Important (random) graph characteristics to be employed in detecting the
small-world effect are the (expected) diameter of the graph and the mean distance between
pairs of nodes. Somewhat surprisingly, it turns out to be possible to estimate or
calculate these quantities in the Strogatz-Watts model as functions of the two
parameters, N and p or $\phi$. This is remarkable as one has to deal with two coupled
non-linear difference or differential equations for the variables f(n) and g(n) described
in the following section and in the caption of figure 1. The degree of nonlinearity
of course varies, depending on the extent of the approximations being made.

The general observation is that, depending on the number of shortcuts in the
network, there exist several regimes in the parameter space. To put it more
succinctly, we solve, on the one hand, the equations of the model for fixed p and N.
On the other hand, it is interesting to study the limit $N \to \infty$ with p(N) now a
function vanishing for large N. That is, in this latter case, we analyse the different
asymptotic regimes around the "point" ($N = \infty$, p = 0). In the first case one can
study the change of behavior of the network for either N fixed and changing p or
vice versa.

For very small p (more precisely, very few shortcuts) the average distance, for
example, scales linearly with N. For still quite small p one expects a (transition
like) threshold or, rather, a threshold region, above which the average distance
(or the diameter) scales roughly as in a sparse random graph, i.e. more or less
logarithmically. There was a certain debate about the nature of this transition
zone. We show in the following that instead of a threshold one actually has a
relatively "broad" crossover region in which the scaling changes in a smooth way
from linear through $N^{\epsilon}[(1-\epsilon)\ln N + O(1)]$ (in first order) to $\ln N$, depending
on p. More precisely, if we scale p with N and choose N large, the corresponding
values of the edge probability are $p \sim N^{-2}$ (linear), $p \sim N^{-(1+\epsilon)}$ (intermediate),
$p \sim N^{-1}$ (logarithmic), respectively. The original threshold was (in our units)
conjectured to occur for $p \sim N^{-2}$.

Another interesting conjecture, which was then corroborated both numerically
and by plausibility arguments, was a finite-size-scaling ansatz for the shape of the
functional dependence of the average distance, $\ell$, on N and p. We were able to
confirm this ansatz, apart from some deviations which occurred in a certain region of
the parameter space.

At the end of the paper we introduce a (fractal-like) notion of dimension for
networks and calculate the dimension of the small-world network.

To briefly characterize our own approach: on the one hand, we include more
possibly relevant terms, which makes the evolution equations more complicated.
Instead of then resorting to approximate solutions, we manage to derive upper and
lower bound comparison difference equations (differential equations) for the "exact"
solutions, which allow us to enclose them from above and below. By this method
we are able to compare the reliability of the various (approximate) results produced
in the literature, relate them to our exact bounds and represent them in a single
diagram. Last but not least, we took care to compensate for the possible
quantitative errors (coming from overcounting) which are easily introduced by being
too cavalier about the (rule-of-thumb) spreading argument frequently invoked for
random graphs.

We expect that our method of providing comparison difference or differential 
equations for complicated non-linear equations, which, on their side, are presumably 
not solvable, may represent a strategy which might prove useful in a more general 
context. 

2 The Description of the Small World Model 

We start from the graph $\mathbb{Z}_N$, i.e. N nodes on a line with periodic boundary
condition; that is, node $x_N$ is linked to node $x_1$.

Remark: To make the main thread of our analysis better visible, we treat for the time
being only the nearest neighbor model. A node $x_i$ is only connected to $x_{i\pm1}$ (i.e.
k = 1, or coordination number z = 2). The more general case is a straightforward
generalisation and can be reduced to the case k = 1 by a renormalization step, cf. [..].

In a next step we superimpose on this graph a true random graph, living on
the same N nodes and having independent edge probability p (cf. for example [15]
or [16]). This entails that the expected number of random edges in our model is
$p \cdot N(N-1)/2$ and that the expected number of random edges incident with a fixed
but arbitrary node, $x_i$ (the average vertex degree in the respective random graph),
is $p \cdot (N-1)$. Note that with this definition it may happen that some of the
nearest regular neighbors of a node $x_0$ are also linked to $x_0$ by a random
edge. This plays no role in the global analysis; it could of course be
avoided, but allowing it makes the numerical analysis more compact.

The above p should be compared with the probability, $\phi$, occurring in [..] or
[13]. The latter is referred to the existing number of regular (non-random)
edges, that is $k \cdot N$, or N for k = 1. The reason for this derives from the original
model, in which existing edges were randomly rewired. Thus, for k = 1, $\phi$ leads
to an expected number of random edges in the graph equal to $\phi \cdot N$ instead of
$p \cdot N(N-1)/2$ in our model (N large). The two probabilities are hence related by

$$p = \frac{2\phi}{N-1} \tag{1}$$

if we refer them to the same global expected number of shortcuts in the
superimposed random graph.
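
The construction just described (ring plus superimposed random graph) and relation (1) can be illustrated with a small sketch. It is not taken from the paper: the function names `shortcut_probability` and `small_world_graph`, the fixed seed and the example values N = 200, phi = 0.05 are assumptions made purely for illustration.

```python
import random

def shortcut_probability(phi, n_nodes):
    """Edge probability p matched to phi via relation (1): p = 2*phi/(N-1)."""
    return 2.0 * phi / (n_nodes - 1)

def small_world_graph(n_nodes, p, seed=0):
    """Ring Z_N (k = 1) superimposed with a random graph of edge probability p.

    Returns an adjacency list (list of sets). Illustrative helper only.
    """
    rng = random.Random(seed)
    adj = [set() for _ in range(n_nodes)]
    for i in range(n_nodes):                 # regular nearest-neighbour edges
        j = (i + 1) % n_nodes
        adj[i].add(j)
        adj[j].add(i)
    for i in range(n_nodes):                 # independent random edges (shortcuts)
        for j in range(i + 1, n_nodes):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

N, phi = 200, 0.05
p = shortcut_probability(phi, N)
expected_shortcuts = p * N * (N - 1) / 2     # equals phi * N by relation (1)
adj = small_world_graph(N, p)
```

As in the text, a random edge may coincide with a regular ring edge; the set-based adjacency list simply absorbs such duplicates.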

We are in particular interested in the small-world effect. What is usually studied
is the mean distance, L(G), between two arbitrarily selected nodes, $x_i$ and $x_j$. Note
that graphs are discrete metric spaces in a natural way, the distance $d(x_i, x_j)$ being
given by the length of a shortest path connecting them (number of consecutive
edges). If the individual realisations of graphs or networks belong to a sample
(probability) space, an averaging has to be performed both over the selected pairs
of nodes and the sample space (cf. [..] or [10]).

This quantity is closely related to another important graph characteristic, the
(expected) diameter, which we will study in the following. Choosing an arbitrary
start node, $x_0$, the graph metric allows us to define l-neighborhoods, $U_l(x_0)$, with

$$U_l(x_0) := \{x_i,\; d(x_0, x_i) \le l\} \tag{2}$$

and their respective boundaries, defined by

$$\Gamma_l(x_0) := \{x_i,\; d(x_0, x_i) = l\} \tag{3}$$

With $|\Gamma_l(x_0)|$ denoting the number of nodes lying in $\Gamma_l(x_0)$, the sequence of these
values is called the distance degree sequence relative to node $x_0$ and is denoted
by $dds(x_0)$. When tabulating this for the full node set we arrive at the distance
distribution $dd(G) = (D_1, D_2, \ldots, D_D)$, with $D_l$ the number of pairs of nodes having
distance equal to l (cf. [..]). We have the following formula for the mean
distance:

$$L(G) := M^{-1} \cdot \sum_{l=1}^{D} l \cdot D_l \tag{4}$$

with $M = N(N-1)/2$ being the number of different pairs of nodes. The number
D = D(G), that is, the maximal distance occurring in this counting, is called the
diameter of the graph.
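
The distance distribution, the mean distance (4) and the diameter can be computed for a single connected graph by breadth-first search. The following is a minimal sketch, assuming the graph is given as an adjacency list of sets; the helper names and the 10-node ring used as an example are illustrative choices, not part of the original analysis.

```python
from collections import deque

def bfs_distances(adj, start):
    """Graph distances d(start, x) for all nodes x reachable from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_distribution(adj):
    """Return {l: D_l}, with D_l the number of unordered node pairs at distance l."""
    counts = {}
    for s in range(len(adj)):
        for t, d in bfs_distances(adj, s).items():
            if t > s:                        # count each pair once
                counts[d] = counts.get(d, 0) + 1
    return counts

def mean_distance_and_diameter(adj):
    """Mean distance L(G) as in formula (4) and diameter D(G); assumes G connected."""
    n = len(adj)
    dd = distance_distribution(adj)
    pairs = n * (n - 1) // 2                 # M = N(N-1)/2
    mean = sum(l * d for l, d in dd.items()) / pairs
    return mean, max(dd)

ring = [{(i - 1) % 10, (i + 1) % 10} for i in range(10)]
L_ring, D_ring = mean_distance_and_diameter(ring)
```

For the ring with 10 nodes this gives D_1 = ... = D_4 = 10, D_5 = 5, hence L = 125/45 and D = 5. The averaging over the sample space of random graphs mentioned above is not included here.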

Evidently, L(G) and D(G) cannot be expected to be the same numerically, but
in the generic situation one may surmise that they are closely related and scale
in the same way for, say, $N \to \infty$ (being motivated by the qualitative picture of
spreading in a random graph). While the precise analytic calculation of the distance
distribution dd(G), the mean distance or the diameter is quite an ambitious task in the
random graph framework (see for example [15]), the qualitative behavior can be
inferred as follows.

If the edge probability, p, is sufficiently low, a randomly selected node, $x_0$, has
on average $p \cdot N$ neighbors and roughly $p^2 N^2$ second neighbors and so on, as long as
the number of vertices being reached is not too large compared to the total number
N. If this latter condition no longer holds, the probability increases that one
meets a given vertex twice. Hence, due to this overcounting the true numbers are
systematically smaller, the deviations becoming appreciable when N is approached.

In the sequel we therefore employ the following strategy. Instead of calculating
the exact distance degree sequence or the exact diameter of our small-world model,
we calculate, among other things, the number of steps necessary to reach the fraction
$\alpha \cdot N$ of nodes, with $\alpha$ preferably chosen as 1/2. In this way we hope to avoid the
problems of overcounting at least to a large degree, while, on the other hand, we
expect the scaling behavior of the respective quantities to be more or less the same
as for the true numbers.
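
On a concrete graph the quantity used in this strategy, the number of steps needed to reach the fraction alpha of all nodes, can be measured by layer-wise spreading from a start node. The sketch below is only an illustration: the function names and the two test graphs (a plain ring of 100 nodes and the same ring with one hypothetical shortcut between nodes 0 and 50) are assumptions, and the start node is counted as already reached.

```python
def steps_to_reach_fraction(adj, start, alpha=0.5):
    """Number of spreading steps until at least alpha*N nodes are reached."""
    target = alpha * len(adj)
    reached = {start}
    frontier = {start}
    steps = 0
    while len(reached) < target and frontier:
        # next layer: neighbours of the current boundary not reached before
        frontier = {v for u in frontier for v in adj[u]} - reached
        reached |= frontier
        steps += 1
    return steps

def ring(n):
    """Adjacency list of the ring Z_n with nearest-neighbour edges (k = 1)."""
    return [{(i - 1) % n, (i + 1) % n} for i in range(n)]

plain = ring(100)
with_shortcut = ring(100)
with_shortcut[0].add(50)
with_shortcut[50].add(0)

steps_plain = steps_to_reach_fraction(plain, 0)             # 25
steps_shortcut = steps_to_reach_fraction(with_shortcut, 0)  # 13
```

A single shortcut opens a second pair of spreading fronts and roughly halves the number of steps, which is the effect the gap variable g(n) of the next section keeps track of.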

3 The Derivation of the Evolution Equations



As we remarked at the end of the preceding section, we want to estimate the
expected number of steps necessary to reach, for example, half of the number of
vertices, starting from a fixed but arbitrary vertex, $x_0$. We expect that this
quantity displays the same N- and p-dependence as the mean distance or the diameter
of the network or graph under discussion, while avoiding the problem of
overcounting and the very complex equations that arise when approaching
the total number of vertices, N.

As has been done in [..], we choose the following two variables.

Definition 3.1 f(n) denotes the expected number of nodes not reached after n
steps, starting from an arbitrary but fixed node, $x_0$ ("free nodes"). g(n) denotes
the number of gaps, that is, the number of (connected) segments of nodes, lying on
the original $\mathbb{Z}_N$, not yet reached and which are separated by the segments of nodes
already reached after n steps (cf. figure 1).

We note that our evolution equations describe the evolution of mean or expected
values. In some respects this approach hence shares some characteristics with what
one calls mean-field theory in statistical mechanics (cf. also [13]). However, we
think that the approximations we make are not as drastic as in typical
mean-field models, where, among other things, Hamiltonians are typically strongly
modified (frequently almost linearized). This is not the case in the small-world model
which, in particular in our approach, contains strongly non-linear terms which
encode at least part of the fluctuation content in integrated form (see, for example,
the discussion about the inclusion of terms incorporating the effects of very small
gaps around eqn (10)). In a sense, what we call "full equations" in the following
rather describe the behavior of a "typical" or generic small-world graph. So it does
not come as a great surprise that the evolution equations for the expected values
are in relatively good agreement with what follows from real numerical simulations
of the model.

On the other hand, statistical fluctuations and correlations are not really treated
by us, although this could be done in principle as the underlying probability space
is explicitly given, that is, the regular graph $\mathbb{Z}_N$ superimposed with a random
graph for which probability theory is well established. One therefore may make the
slightly vague statement that the small-world model is, depending on the degree of
approximation, of an intermediate character.

To arrive at equations which are not merely asymptotically correct or good only
in a restricted region of parameter or phase space, we try to include as many
relevant terms as possible (under the proviso that the resulting coupled and non-linear
difference or differential equations are still solvable). We start with the difference
equation describing the expected change of the number of gaps between consecutive
steps.

For n = 0 we have exactly one gap, comprising all the nodes except $x_0$, that is, we
have g(0) = 1. The number of gaps increases only due to the consecutive inclusion
of shortcuts with increasing n, connecting pairs of nodes in a random manner and
being parametrized by the edge probability p. The main contribution in consecutive
steps, $n \to (n+1)$, comes from the term $+2p\,g(n)f(n)$. We will further explain it
after the introduction of equation (6). There exists however another contribution
which acts in the opposite direction and which becomes relevant when many
gaps already exist. This term reads $-2g(n)^2/f(n)$ and is of a purely combinatorial (more
involved) character, to be explained below when discussing equation (6).

The initial condition for f reads f(0) = N - 1. For k = 1 each gap of free nodes
shrinks by two in the next step, provided the gap comprises more than one node.
Neglecting in a first step this latter possibility, the first contribution is hence of the
form $-2g(n)$. Then there is a contribution coming from new shortcuts of the form
$-2p\,g(n)f(n)$. The overcounting in the first term (neglect of one-node gaps) has
now to be compensated by a term $+g(n)^2/f(n)$. The emergence of this and the
corresponding term in equation (5) will be explained in greater detail below.

Observation 3.2

$$g(n+1) - g(n) = 2p\,g(n)f(n) - 2g(n)^2/f(n) \tag{5}$$
$$f(n+1) - f(n) = -2g(n) - 2p\,g(n)f(n) + g(n)^2/f(n) \tag{6}$$

We furthermore have the following a priori bounds, which follow immediately from
the meaning of the respective variables in our model system:

Lemma 3.3 We have $g(n) \le N/2$ and $g(n) \le f(n)$.

Proof: Each gap is followed by a non-empty string of nodes already covered,
hence the first inequality. The second one follows as each gap contains at least one
node. □
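
The difference equations (5) and (6) of Observation 3.2 can be iterated directly from g(0) = 1, f(0) = N - 1 until f(n) has dropped to alpha*N. The following sketch does just that; the function name `iterate_simplified`, the safety cap `max_steps` and the parameter values are illustrative assumptions, not values used in the paper.

```python
def iterate_simplified(n_nodes, p, alpha=0.5, max_steps=10**6):
    """Iterate equations (5), (6) until f(n) <= alpha*N; return [(g(n), f(n)), ...]."""
    g, f = 1.0, float(n_nodes - 1)           # initial conditions g(0), f(0)
    traj = [(g, f)]
    while f > alpha * n_nodes and len(traj) <= max_steps:
        dg = 2 * p * g * f - 2 * g * g / f             # rhs of (5)
        df = -2 * g - 2 * p * g * f + g * g / f        # rhs of (6)
        g, f = g + dg, f + df
        traj.append((g, f))
    return traj

N = 1000
fast = iterate_simplified(N, p=1e-3)         # p of order 1/N: many shortcuts
slow = iterate_simplified(N, p=1e-6)         # p of order 1/N^2: very few shortcuts
n_star_fast = len(fast) - 1                  # first n with f(n) <= N/2
n_star_slow = len(slow) - 1
```

With these example values the number of steps is small for p of order 1/N and of the order of N for p of order 1/N^2, in line with the qualitative picture of logarithmic versus linear scaling.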

The occurrence of the term $2p\,g(n)f(n)$ can be understood as follows. In each
step, $n \to (n+1)$, the two endpoints of each of the g(n) gaps may become the
source of new shortcuts to the remaining f(n) free nodes, the expected number
per endpoint being $p\,f(n)$. This leads to a term of the above form in both equations. One
can even be a little more precise. New gaps are not created if the
shortcuts end at free nodes which are adjacent to nodes already reached. There are
on average 2g(n) of them. That is, in the equation describing the evolution of gaps
the correct term is $2p\,g(n)(f(n) - 2g(n))$. The equation describing the evolution of
f(n) is not altered. This additional correction term is always negative and we could
in principle incorporate it in the following. It would make the numerics slightly
nastier without having a big effect. So we will largely neglect this term but will
incorporate it into what we call the "full equations", see (18).

The other quadratic non-linear terms are slightly more intricate and of a more
stochastic nature. While in equation (6) gaps containing only one node
contribute only one instead of two nodes to the difference equation, in equation (5)
gaps vanish in the next step if at level n they contain only one or two nodes. The
probability for the existence of such gaps will now be calculated. We begin with
the case of one-gaps. We will solve this problem with the help of the well-known
problem of partitioning a given set into disjoint subsets. We associate the set of g
gaps and f free nodes with f balls to be distributed over g boxes. In general there
exist

$$\binom{f+g-1}{g-1} = \binom{f+(g-1)}{f} \tag{7}$$

combinations (see [..] or [..]). In effect we calculate the number of different words
of length f + (g + 1) consisting of (g + 1) bars and f dots, under the proviso that each
word begins and ends with a bar and that each consecutive pair of bars is divided
by a non-empty string of dots, as in our case no box can be empty. This implies
that we can place exactly one ball in each box and perform the above calculation,
(7), which represents the number of partitions without the constraint of non-empty
boxes, for the remaining number of (f - g) balls, yielding

$$A := \binom{f-1}{g-1} \tag{8}$$

configurations. This is the cardinality of the set of elementary events in our
probability space.

To calculate the expected number of gaps containing only one or two nodes, we
introduce the following random variables, $Y_1$, $Y_2$, over the probability space of words
which we associated with the random graphs $G_j$:

$$Y_i(G_j) := \#\{\text{gaps of length } i\} \tag{9}$$

in each of the above A configurations (graphs), $G_j$. Before we proceed, a short
remark on the probabilities of the individual configurations is in order.

In our model probability space a regular graph is superimposed with a random
graph with edge probability p. In our above calculation we deal with fixed numbers
f and g. As the gaps arise due to the existence of random edges, the gaps are
expected to be randomly scattered over the regular graph $\mathbb{Z}_N$ in basically the same
way as pairs of nodes are linked by random edges, that is, almost independently.
This should then also essentially hold for the number of gaps met after n steps.
Furthermore, this reasoning should not be affected in a serious way by the possible
annihilation of gaps for large step number n, as long as we stay away from the
regime where the spreading argument for random graphs is no longer correct. From
this we see that it is a reasonable strategy to remain below the value N/2 with the
step number n. We hence conclude that in the indicated regime each configuration
should have the same probability $A^{-1}$.

We therefore have

$$E(Y_i) = A^{-1} \cdot \sum_j Y_i(G_j) \tag{10}$$

For the one-gaps we can represent $Y_1$ by more elementary random variables, $y_k$,
with k running from 1 to g, enumerating the existing g gaps, and $y_k = 1$ if gap (k)
contains only one element and zero otherwise. This yields

$$Y_1(G_j) = \sum_{k=1}^{g} y_k(G_j) \tag{11}$$

and

$$E(Y_1) = \sum_{k=1}^{g} E(y_k) = g \cdot A^{-1} \binom{f-2}{g-2} \tag{12}$$

where $\binom{f-2}{g-2}$ is the number of configurations with only one element in gap (k).
Correspondingly we get for the expected number of 2-gaps:

$$E(Y_2) = g \cdot A^{-1} \binom{f-3}{g-2} \tag{13}$$

For the one-gaps this yields

$$E(Y_1) = g \cdot \frac{g-1}{f-1} \tag{14}$$

which is approximately $g \cdot g/f$ for large f and g. For the two-gaps we have

$$E(Y_2) = g \cdot \frac{(g-1)(f-g)}{(f-1)(f-2)} \tag{15}$$

and for f large, i.e. $f \approx (f-1) \approx (f-2)$:

$$E(Y_2) \approx g \cdot (g/f - 1/f - g^2/f^2 + g/f^2) \tag{16}$$

With our a priori estimate $g \le f$, and as we start from the initial conditions
$f(0) = (N-1) \approx N$, g(0) = 1, in most of our phase space the dominant contribution comes
from the term $g \cdot g/f$, also in the case of two-gaps.
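
The counting argument can be checked by brute force for small numbers: one enumerates all ways of splitting f free nodes into g non-empty gaps, treats them as equally likely, and averages the number of gaps of length one or two. The sketch below does this with exact rational arithmetic and compares with the closed forms C(f-1, g-1), g(g-1)/(f-1) and g(g-1)(f-g)/((f-1)(f-2)) that follow from the binomial counting; the function names and the example values f = 10, g = 4 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def compositions(f, g):
    """Yield all ways to split f free nodes into g ordered, non-empty gaps."""
    for cuts in combinations(range(1, f), g - 1):
        bounds = (0,) + cuts + (f,)
        yield tuple(b - a for a, b in zip(bounds, bounds[1:]))

def expected_gaps_of_length(f, g, length):
    """Average number of gaps of the given length over all configurations."""
    configs = list(compositions(f, g))
    total = sum(parts.count(length) for parts in configs)
    return Fraction(total, len(configs))

f, g = 10, 4
n_configs = sum(1 for _ in compositions(f, g))
closed_form_count = comb(f - 1, g - 1)        # number of configurations
e1 = expected_gaps_of_length(f, g, 1)         # expected number of one-gaps
e2 = expected_gaps_of_length(f, g, 2)         # expected number of two-gaps
```

For f = 10 and g = 4 there are 84 configurations, with on average 4/3 one-gaps and exactly one two-gap, in agreement with the closed forms.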

Conclusion 3.4 The expected number of one- or two-gaps is approximately

$$E(\#(\text{1- or 2-gaps})) \approx 2g^2/f \tag{17}$$

(Note that in our probability space the possibilities of being a 1-gap or a 2-gap are
mutually exclusive.) This result explains the occurrence of the correction terms in
our evolution equations. Without these approximations, the equations (5) and (6)
would read

$$g(n+1) - g(n) = 2pg(f - 2g) - \frac{g(g-1)}{f-1} - \frac{g(g-1)(f-g)}{(f-1)(f-2)} \tag{18}$$

$$f(n+1) - f(n) = -2g - 2pgf + \frac{g(g-1)}{f-1} \tag{19}$$

with f and g instead of f(n) and g(n) on the rhs. We'll refer to these as "full
equations" and compare them with the simplified ones ((5) and (6)) in figure 2.
Note in particular that we approximated the last term occurring in (18) by $-g^2/f$
in (5), neglecting the positive higher order contribution, which is essentially of the form
$g^3/f^2$.
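
To get a feeling for how much the approximations matter, one can iterate the simplified equations and the "full equations" side by side. The sketch below assumes the full equations in the form with the exact one-gap and two-gap expectations, g(g-1)/(f-1) and g(g-1)(f-g)/((f-1)(f-2)), and the corrected shortcut term 2pg(f-2g); the function names and the parameter values are illustrative assumptions.

```python
def step_simplified(g, f, p):
    """One step of the simplified equations (5), (6)."""
    return (g + 2 * p * g * f - 2 * g * g / f,
            f - 2 * g - 2 * p * g * f + g * g / f)

def step_full(g, f, p):
    """One step of the "full equations" (18), (19)."""
    one_gaps = g * (g - 1) / (f - 1)                          # exact E(Y_1)
    two_gaps = g * (g - 1) * (f - g) / ((f - 1) * (f - 2))    # exact E(Y_2)
    return (g + 2 * p * g * (f - 2 * g) - one_gaps - two_gaps,
            f - 2 * g - 2 * p * g * f + one_gaps)

def steps_to_half(step, n_nodes, p, max_steps=10**5):
    """Number of steps until f(n) <= N/2, starting from g(0) = 1, f(0) = N - 1."""
    g, f = 1.0, float(n_nodes - 1)
    n = 0
    while f > n_nodes / 2 and n < max_steps:
        g, f = step(g, f, p)
        n += 1
    return n

N, p = 1000, 1e-3
n_simplified = steps_to_half(step_simplified, N, p)
n_full = steps_to_half(step_full, N, p)
```

The two step functions share the same signature, so any other variant of the evolution equations can be compared in the same way.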

A brief comment is in order on the corresponding formulas derived in [13]
(cf. their formulas (3), (6) or (A10), (A11)). We decided to neglect all terms in
(16) except the leading one, $g^2/f$, which is reasonable in our view. In [13], in the
corresponding equation an additional term of the type g/f occurs (derived by a
different argument). On the other hand, more important in our view is equation
(6), which comprises three terms in our approach (including a nasty non-linear one,
$g^2/f$), while in [13] only the first one, $-2g$, occurs on the rhs. This makes the
corresponding equations of course much easier to solve but may only be a good
approximation in a restricted regime of parameter space (see the brief discussion at
the end of section [..]). We discuss and compare the numerical results in section [..].
One can see that the solution of [13] is similar to our lower-bound equation for f
(eqn (45)), which is reasonable as in our lower bound for f the quadratic term is
largely suppressed (cf. the following section and figure [..]).



4 Solution Strategies 

The above system of evolution equations contains non-linearly coupled quantities
and can be solved only in very exceptional, lucky circumstances. Instead of making
more or less uncontrollable approximations, we develop the following strategy. We
try to enclose the above exact equations by comparison equations bounding the
exact solutions, f(n), g(n), from above and below, the corresponding variables
being denoted by

$$\overline{f}(n),\; \underline{f}(n),\; \overline{g}(n),\; \underline{g}(n) \tag{20}$$

The problem is that, on the one hand, the comparison equations have to be
chosen so that they can be rigorously solved and, on the other hand, these bounds
have to be quite good, so that we are able to infer something relevant also for the
enclosed exact equations, in particular for the scaling limit N very large and p a
vanishing function of N. A central role will be played by the value $n_*$ for which
we have reached on average N/2 of the nodes when starting from an arbitrary but
fixed node $x_0$ (or more generally $\alpha N$ nodes with $0 < \alpha < 1$). In other words, the
range of n-values we are using is restricted by

$$n \in [0, n_*] \;\text{ so that }\; (N-1) = f(0) \ge f(n) \ge \alpha N \tag{21}$$

We now study the rhs of the equations (5), (6). For one, we assume that these
equations have been solved for our initial conditions g(0) = 1, f(0) = N - 1, so
that g(n), f(n) now represent particular functions of n. On the other hand we can
regard the rhs (dropping the dependence on the variable n) as functions on the
phase space spanned by the possible values of the variables g, f. In our assumed
range of possible parameters and variables we have the estimate

$$2pgN - 2g^2/N \ge 2pgf - 2g^2/f \ge 2pg \cdot (\alpha N) - 2g^2/(\alpha N) \tag{22}$$

The idea is now to introduce the comparison difference equations

$$\overline{g}(n+1) - \overline{g}(n) := 2pN\,\overline{g}(n) - 2\overline{g}(n)^2/N \tag{23}$$

and

$$\underline{g}(n+1) - \underline{g}(n) := 2p(\alpha N)\,\underline{g}(n) - 2\underline{g}(n)^2/(\alpha N) \tag{24}$$

with the initial conditions

$$\overline{g}(0) = g(0) = \underline{g}(0) = 1 \tag{25}$$

For the initial differences we have

$$\overline{g}(1) - \overline{g}(0) > g(1) - g(0) > \underline{g}(1) - \underline{g}(0) \tag{26}$$

Our strategy is now to use these comparison equations to learn something about
the true equations. Unfortunately matters are not as transparent for difference
equations as for differential equations. The reason is that they are only
given at discrete points and may (therefore) display a more complex behavior (see for
example Hoelder's theorem and extensions thereof in [..], p. 283 or [25], p. 220). Due
to these problems we will in the following go over to the corresponding differential
equations, being aware, however, that it does not seem to be an easy task
to provide good error estimates, in particular as the differences in our context are
not infinitesimal (as to this interesting question of principle cf. the discussion in
[..]). What is furthermore remarkable is the observation (see below) that we can
prove a useful theorem in the case of differential equations the analogue of which, as
far as we can see, we cannot prove for difference equations (at least with the same
methods).

The corresponding differential equations read:

$$g'(x) = 2p\,g(x)f(x) - 2g(x)^2/f(x) \tag{27}$$
$$f'(x) = -2g(x) - 2p\,g(x)f(x) + g(x)^2/f(x) \tag{28}$$

with g(0) = 1, f(0) = (N - 1), $x \in [0, x_*]$ so that $(N-1) \ge f(x) \ge \alpha N$. The
comparison differential equations with respect to g(x) are

$$\overline{g}'(x) = 2pN\,\overline{g}(x) - 2\overline{g}(x)^2/N \tag{29}$$
$$\underline{g}'(x) = 2p(\alpha N)\,\underline{g}(x) - 2\underline{g}(x)^2/(\alpha N) \tag{30}$$

with $\overline{g}(0) = \underline{g}(0) = 1$.

As to equation (28) we proceed as follows (we note that we in fact experimented
with different possibilities; the one we are presenting below seems to be the most
appropriate one). We mentioned above the a priori bound $g \le f$. For the rhs of
equation (28) we then have:

$$-2g - 2pgf + g^2/f \le -2g - 2pgf + g = -g - 2pgf \tag{31}$$

On the other hand we also have

$$-2g - 2pgf + g^2/f \ge -2g - 2pgf \tag{32}$$

Therefore our comparison differential equations for f are

$$\overline{f}'(x) = -\underline{g}(x) - 2p\,\underline{g}(x)\overline{f}(x) \tag{33}$$
$$\underline{f}'(x) = -2\overline{g}(x) - 2p\,\overline{g}(x)\underline{f}(x) \tag{34}$$

with $\overline{f}(0) = N$, $\underline{f}(0) = N - 2$.

Remark: As N is supposed to be very large, it does not make a big difference for 
the numerical calculations to let all the initial values be equal to N. The above 
choice makes the analytic argument a little bit simpler (see below). 

To further exploit these comparison equations we proceed as follows. Note that
we have succeeded in decoupling the equations for f and g. We can solve the
differential equations for $\overline{g}$ and $\underline{g}$ and plug the solutions into the $\overline{f}$- and $\underline{f}$-equations. We
assume that together with the comparison differential equations the original
differential equations for g and f have been solved for the mentioned initial conditions.
In the differential equation for g with initial condition g(0) = 1 we can then regard
the corresponding f-solution as an external function, with g(x) solving this "new"
differential equation (together with the given initial condition). We compare the
solution g(x) of this latter equation with the solutions of the differential equations
for $\overline{g}$, $\underline{g}$ respectively. We have

$$\underline{g}(0) = g(0) = \overline{g}(0) = 1 \quad\text{and}\quad \underline{g}'(0) < g'(0) < \overline{g}'(0) \tag{35}$$

We now prove the following result. 

Proposition 4.1 Let $y_1(x)$ and $y_2(x)$ be solutions of the two differential equations

$$y_1'(x) = F_1(y_1, x), \quad y_2'(x) = F_2(y_2, x) \tag{36}$$

on the interval $[0, x_*]$ with $y_1(0) \le y_2(0)$. Let $F_1$, $F_2$ fulfil

$$F_1(y, x) < F_2(y, x) \tag{37}$$

on the domain $[0, x_*] \times I_y$, $I_y$ a suitable y-interval, and with both $y_1(x)$, $y_2(x)$ staying
in this domain. Then

$$y_1(x) \le y_2(x) \quad\text{on } [0, x_*] \tag{38}$$

Proof: From the assumptions it follows that $y_1(x) < y_2(x)$ in some open interval
$(0, \epsilon)$. If $y_1(x) > y_2(x)$ for some x, there exists an r > 0 (by continuity) with
$y_1(r) = y_2(r)$ and $y_1(x) > y_2(x)$ in an open interval $(r, r + \epsilon')$. But this is a
contradiction since

$$F_1(y_1(r), r) < F_2(y_1(r), r) \tag{39}$$

hence again implying that $y_1(x) < y_2(x)$ in an open interval $(r, r + \epsilon'')$. We conclude
that $y_1(x) \le y_2(x)$ on $[0, x_*]$. □

It sometimes happens that we have $F_1(y, x) < F_2(y, x)$ on the open interval
$(0, x_*)$ but $F_1(y(0), 0) = F_2(y(0), 0)$ for some value y(0). We can then prove the
following corollary:

Corollary 4.2 We make the same assumptions as before except for
$F_1(y(0), 0) = F_2(y(0), 0)$ instead of $F_1(y(0), 0) < F_2(y(0), 0)$. We assume that there exists a
parameter $\lambda$ so that on the closed interval we have

$$F_1(y, x; \lambda) < F_2(y, x) \tag{40}$$

for $\lambda > 0$ and $F_1(y, x; 0) = F_1(y, x)$, the dependence on $\lambda$ being continuous or
differentiable. Then the parameter dependent solutions converge pointwise towards
the solutions for $\lambda = 0$ (cf. [..]). For $\lambda > 0$ our above result applies. So, by
continuity it applies also in the limit $\lambda \to 0$.

Note that in our case the parameter $\lambda$ is the parameter taking the values N, N - 1
etc.

Remark: We surmise that such comparison results are known in the large literature
about differential equations but we were unable to find a reference.

Conclusion 4.3 What we have now shown is that, under the assumptions being
made, $\overline{g}$, $\underline{g}$ and $\overline{f}$, $\underline{f}$ are upper and lower bounds of the corresponding solutions
g, f of the original differential equations.
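
The enclosure can also be observed numerically. The sketch below integrates the original differential equations together with the four comparison equations, assuming that the upper bound of f is driven by the lower bound of g and vice versa, and checks the ordering along the way as long as f(x) > alpha*N. The hand-written Runge-Kutta integrator, the step size h = 0.01 and the values N = 1000, p = 0.001 are illustrative assumptions; this is a numerical plausibility check, not a substitute for Proposition 4.1.

```python
def rhs(y, p, n, alpha):
    """Right-hand sides for (g, f, g_up, g_lo, f_up, f_lo)."""
    g, f, g_up, g_lo, f_up, f_lo = y
    return (
        2 * p * g * f - 2 * g * g / f,                           # original g-equation
        -2 * g - 2 * p * g * f + g * g / f,                      # original f-equation
        2 * p * n * g_up - 2 * g_up * g_up / n,                  # upper bound for g
        2 * p * alpha * n * g_lo - 2 * g_lo * g_lo / (alpha * n),  # lower bound for g
        -g_lo - 2 * p * g_lo * f_up,                             # upper bound for f
        -2 * g_up - 2 * p * g_up * f_lo,                         # lower bound for f
    )

def rk4_step(y, h, *args):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(y, *args)
    k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k1)), *args)
    k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k2)), *args)
    k4 = rhs(tuple(a + h * b for a, b in zip(y, k3)), *args)
    return tuple(a + h * (b + 2 * c + 2 * d + e) / 6
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def integrate(n, p, alpha=0.5, h=0.01, max_steps=10**6):
    """States on the grid x = 0, h, 2h, ... for which f(x) > alpha*N."""
    y = (1.0, n - 1.0, 1.0, 1.0, float(n), n - 2.0)   # initial conditions
    traj = []
    while y[1] > alpha * n and len(traj) < max_steps:
        traj.append(y)
        y = rk4_step(y, h, p, n, alpha)
    return traj

traj = integrate(1000, 1e-3)
enclosed = all(g_lo <= g <= g_up and f_lo <= f <= f_up
               for g, f, g_up, g_lo, f_up, f_lo in traj)
```

The length of `traj` times h gives a numerical estimate of the value of x at which f reaches N/2 for the original equations.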



5 The Quantitative Results 

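With our $\alpha$ being now either 1 or 1/2 we can express both the upper and lower bound
by a single equation:

$$g'(x) = 2\alpha pN\,g(x) - \frac{2g(x)^2}{\alpha N} \tag{41}$$

with $g = \overline{g}$ for $\alpha = 1$ and $g = \underline{g}$ for $\alpha = 1/2$. This non-linear differential
equation (of Bernoulli type) can be transformed into a linear one with the help of the
transformation $z := g^{-1}$ and yields, with the initial condition g(0) = 1, the solution

$$g(x) = \frac{p\alpha^2N^2\, e^{2\alpha pNx}}{p\alpha^2N^2 - 1 + e^{2\alpha pNx}} \tag{42}$$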
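
For the reader who wants to retrace the Bernoulli step, the substitution can be spelled out as follows. This is a sketch of the standard calculation for an equation of the form g' = 2αpN g - 2g²/(αN) with g(0) = 1; the abbreviations a and b are introduced here only for brevity and are not part of the original notation.

```latex
% Abbreviations (introduced here for brevity): a := 2\alpha p N, b := 2/(\alpha N)
\begin{align*}
g' &= a\,g - b\,g^{2}, \qquad z := g^{-1} \\
z' &= -g'/g^{2} = -a\,z + b \\
z(x) &= \frac{b}{a} + \Bigl(1 - \frac{b}{a}\Bigr)e^{-a x}, \qquad z(0) = 1 \\
\frac{b}{a} &= \frac{1}{p\alpha^{2}N^{2}}
\;\Longrightarrow\;
g(x) = \frac{1}{z(x)}
     = \frac{p\alpha^{2}N^{2}\,e^{2\alpha p N x}}{p\alpha^{2}N^{2} - 1 + e^{2\alpha p N x}}
\end{align*}
```

The solution is of logistic type: it starts at g(0) = 1, grows like exp(2αpNx) at first and saturates at pα²N².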

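Inserting these solutions into the corresponding upper and lower bound equations
for f(x),

$$\overline{f}'(x) = -2p\,\underline{g}(x)\overline{f}(x) - \underline{g}(x) \tag{43}$$
$$\underline{f}'(x) = -2p\,\overline{g}(x)\underline{f}(x) - 2\overline{g}(x) \tag{44}$$

and introducing the parameter $\beta \in \{1, 2\}$ so that $f' = -2pgf - \beta g$, we obtain (after
a simple variable transformation, $\overline{f} \to \overline{f} + 1/2p$, $\underline{f} \to \underline{f} + 1/p$, and separation of
variables) the following result, using f(0) = N as initial condition instead of N - 1:

$$f(x) = \frac{2pN + \beta}{2p} \left[ 1 - \frac{1}{p\alpha^2N^2} + \frac{e^{2\alpha pNx}}{p\alpha^2N^2} \right]^{-p\alpha N} - \frac{\beta}{2p} \tag{45}$$

We are interested in the value $x_*$ of x for which f = N/2. Solving for $x_*$ yields

$$x_* = \frac{1}{2\alpha pN} \ln\left( \left[ \left( \frac{2pN + \beta}{pN + \beta} \right)^{1/(p\alpha N)} - 1 \right] p\alpha^2N^2 + 1 \right) \tag{46}$$

the lower bound being assumed for $\alpha = 1$, $\beta = 2$, the upper bound for $\alpha = 1/2$ and
$\beta = 1$.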

Remark: The argument of the logarithm is always larger than one, as $((2pN + \beta)/(pN + \beta))^{1/(\alpha pN)} > 1$ and $p\alpha^2N^2 > 0$. We furthermore want to stress the important point that a too poor approximation of, say, the upper bound, $\overline{f}(n)$, may easily lead to a function which does not decay sufficiently. In that case our estimates would have been useless. We see however that our above choice is strong enough.
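
The closed-form bounds can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the bound equation in the Bernoulli form $g' = 2\alpha pN g - 2g^2/(\alpha N)$ with $g(0) = 1$ and $f(0) = N$, and the function names (`g_bound`, `f_bound`, `x_star`) are ours. It verifies that $f$ indeed equals $N/2$ at the computed $x_*$.

```python
import math

def ratio_power(p, N, alpha, beta):
    """((2pN + beta)/(pN + beta))**(1/(alpha*p*N)), via log1p for numerical stability."""
    pN = p * N
    return math.exp(math.log1p(pN / (pN + beta)) / (alpha * pN))

def g_bound(x, p, N, alpha):
    """Solution of g' = 2*alpha*p*N*g - 2*g**2/(alpha*N) with g(0) = 1."""
    K = p * alpha**2 * N**2
    E = math.exp(2 * alpha * p * N * x)
    return K * E / (K - 1 + E)

def f_bound(x, p, N, alpha, beta):
    """Solution of f' = -2*p*g*f - beta*g with f(0) = N, g as in g_bound."""
    K = p * alpha**2 * N**2
    E = math.exp(2 * alpha * p * N * x)
    return (2 * p * N + beta) / (2 * p) * (1 + (E - 1) / K) ** (-alpha * p * N) - beta / (2 * p)

def x_star(p, N, alpha, beta):
    """Value of x at which f_bound drops to N/2."""
    K = p * alpha**2 * N**2
    return math.log((ratio_power(p, N, alpha, beta) - 1) * K + 1) / (2 * alpha * p * N)

p, N = 1e-3, 1000
x_lower = x_star(p, N, 1.0, 2.0)   # alpha = 1, beta = 2
x_upper = x_star(p, N, 0.5, 1.0)   # alpha = 1/2, beta = 1
print(x_lower, x_upper)
print(f_bound(x_lower, p, N, 1.0, 2.0), f_bound(x_upper, p, N, 0.5, 1.0))
```

Both printed values of $f$ should agree with $N/2$ up to rounding, and the lower-bound $x_*$ should be the smaller of the two.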

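We now want to investigate the scaling regime $N \to \infty$, $p = cN^{-1-\varepsilon}$ with $\varepsilon \in [0, 1]$, $c > 0$, with the boundary cases $p \propto N^{-1}$ and $p \propto N^{-2}$ being particularly interesting. Inserting these choices into the preceding equation we get

$$x_* = \frac{N^{\varepsilon}}{2\alpha c}\,\ln\left(\left[\left(\frac{2cN^{-\varepsilon} + \beta}{cN^{-\varepsilon} + \beta}\right)^{N^{\varepsilon}/(\alpha c)} - 1\right] c\alpha^2N^{1-\varepsilon} + 1\right) \qquad (47)$$

For $\varepsilon = 0$ we have

$$x_* = \frac{1}{2\alpha c}\left(\ln N + C_1(\alpha, \beta, c)\right) \qquad (48)$$

with $C_1$ being of the precise form:

$$C_1 = \ln\left(c\alpha^2\left[\left(\frac{2c + \beta}{c + \beta}\right)^{1/(\alpha c)} - 1\right] + \frac{1}{N}\right) \qquad (49)$$

which becomes asymptotically independent of $N$ for large $N$.

For $\varepsilon \neq 0$ we have (developing the logarithm up to the first order)

$$\left(\frac{2cN^{-\varepsilon} + \beta}{cN^{-\varepsilon} + \beta}\right)^{N^{\varepsilon}/(\alpha c)} = \exp\left(\frac{N^{\varepsilon}}{c\alpha}\,\ln\frac{2cN^{-\varepsilon} + \beta}{cN^{-\varepsilon} + \beta}\right) \qquad (50)$$
$$= \exp\left(\frac{N^{\varepsilon}}{c\alpha}\left[\ln\left(1 + \frac{2c}{\beta N^{\varepsilon}}\right) - \ln\left(1 + \frac{c}{\beta N^{\varepsilon}}\right)\right]\right) \qquad (51)$$
$$= \exp\left(\frac{1}{\alpha\beta}\right) + O(N^{-\varepsilon}) \qquad (52)$$

and get

$$x_* = \frac{N^{\varepsilon}}{2\alpha c}\left((1 - \varepsilon)\ln N + \ln\left(\left[\exp(1/\alpha\beta) - 1\right] c\alpha^2 + 1/N^{1-\varepsilon}\right)\right) \qquad (53)$$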



which obviously also describes the boundary cases $\varepsilon = 0$ and $\varepsilon = 1$ (provided we would include the neglected term $O(N^{-\varepsilon})$ which is now $O(1)$). This behavior is valid both for the upper and lower bound of $f$. As the $x_*$ for the true $f$ has to lie between the respective values for the upper and lower bound we infer that it has the same scaling behavior for $N \to \infty$.
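
To get a feeling for how quickly the asymptotic form is approached, one can compare the exact expression for $x_*$ with its large-$N$ form numerically. The following sketch assumes $p = cN^{-1-\varepsilon}$ and uses our own function names; the parameter values are arbitrary illustrations.

```python
import math

def x_star_exact(N, c, eps, alpha, beta):
    """x_* with p = c * N**(-1 - eps) inserted, before any expansion."""
    pN = c * N ** (-eps)
    K = c * alpha**2 * N ** (1 - eps)          # equals p * alpha^2 * N^2
    power = math.exp(math.log1p(pN / (pN + beta)) / (alpha * pN))
    return math.log((power - 1) * K + 1) / (2 * alpha * pN)

def x_star_asymptotic(N, c, eps, alpha, beta):
    """Leading large-N form for eps != 0 (logarithm expanded to first order)."""
    const = (math.exp(1 / (alpha * beta)) - 1) * c * alpha**2
    return N**eps / (2 * alpha * c) * ((1 - eps) * math.log(N) + math.log(const + N ** (eps - 1)))

for N in (1e4, 1e6, 1e8):
    exact = x_star_exact(N, 1.0, 0.5, 1.0, 2.0)
    approx = x_star_asymptotic(N, 1.0, 0.5, 1.0, 2.0)
    print(N, exact, approx, abs(exact - approx) / exact)
```

The relative difference printed in the last column shrinks with growing $N$, in line with the neglected $O(N^{-\varepsilon})$ term.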



Conclusion 5.1 We infer that between $p \propto N^{-1}$ and $p \propto N^{-2}$ there exists a broad transition zone with the scaling of the diameter or mean distance exactly interpolating between these two boundary cases, $\sim \ln N$ and $\sim N$ (up to now we have only studied the scaling of $x_*$, a parameter which is of course closely related to the above mentioned graph characteristics; see below).

In our above calculations we dealt with the expected value, $x_*$, at which the expected number of free nodes drops to the value $N/2$. We argued above that the corresponding (exact) formulas for the value at which this number assumes the value zero would be much more complicated. To make nevertheless a statement about this value we apply the following (plausibility) argument (which, however, should not be viewed as a rigorous proof). Put differently, we will provide an argument which is expected to hold only for expectation values or typical nodes. Let $X$ be an arbitrary initial vertex and $X'$ a vertex with largest possible distance to $X$ in a given realisation of the network. The expected radius of the $N/2$-neighborhoods for both vertices is equal to $x_*$. In case the corresponding $x_*$-neighborhoods $U_*(X)$ and $U_*(X')$ of $X$ and $X'$ are not disjoint the distance between $X$ and $X'$ must be less than $2x_*$. If these neighborhoods are disjoint the graph $G$ is a disjoint union of $U_*(X)$ and $U_*(X')$ and hence the distance between $X$ and $X'$ must be exactly $2x_*$.
As all these arguments apply only to the generic case, we can associate this value with the expected diameter of our network, denoted by $D$, and get the estimate



$$x_* < D < 2x_* \qquad (54)$$



Remark: We again emphasize that this argument is only correct in an averaged sense in which all nodes are assumed to stand on the same footing. It is of course easy to design particular graphs where this estimate does not hold. Take for example a graph having a densely entangled neighborhood around some node $x$ from which a long one-dimensional string emanates. In this case the diameter is of course much larger than $2x_*$. We think one could prove something rigorous at this place, which, on the other hand, may be a little bit tedious and would unnecessarily blow up the paper.
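
For a single realisation the quantities entering this estimate can be measured directly by a breadth-first search. The sketch below is only an illustration: it assumes the model is a ring of $N$ nodes with nearest-neighbour links in which every pair of nodes is additionally linked with probability $p$, and the function names are ours. It returns the number of free nodes $f(n)$ after $n$ steps, from which the step at which $f$ drops to $N/2$ and the eccentricity of the start node can be read off.

```python
import random
from collections import deque

def small_world(N, p, rng):
    """Ring of N nodes plus a shortcut between each pair with probability p (assumed model)."""
    adj = [{(i - 1) % N, (i + 1) % N} for i in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def free_nodes(adj, start):
    """f[n] = number of nodes not yet reached after n breadth-first steps from start."""
    N = len(adj)
    dist = [-1] * N
    dist[start] = 0
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    ecc = max(dist)                      # eccentricity of the start node
    counts = [0] * (ecc + 1)
    for d in dist:
        counts[d] += 1
    f, reached = [], 0
    for n in range(ecc + 1):
        reached += counts[n]
        f.append(N - reached)
    return f

def half_radius(f, N):
    """First step n at which at most N/2 nodes are still free."""
    return next(n for n, free in enumerate(f) if free <= N / 2)

rng = random.Random(0)
f_ring = free_nodes(small_world(30, 0.0, rng), 0)     # pure ring, no shortcuts
print(half_radius(f_ring, 30), len(f_ring) - 1)
```

For the pure ring the eccentricity is close to twice the half radius, i.e. near the upper end of the estimate; on a discrete graph the relation can be off by one step, which is consistent with the estimate being meant only in an averaged sense.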

On the other hand we infer from equation Q that the average distance, L fulfils 



$$L < D \qquad (55)$$

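Taking again a typical node, $X$ (so that its neighborhood has approximately $N/2$ members), we can approximate the mean distance $L$ by $(N-1)^{-1}\cdot\sum_l l\cdot|\Gamma_l|$ and get the estimate

$$L > (N-1)^{-1}\cdot\Big(\sum_{l \le x_*} l\,|\Gamma_l| + N/2\cdot x_*\Big) > x_*/2 \qquad (56)$$

as

$$\sum_{l > x_*} l\cdot|\Gamma_l| > x_*\cdot\sum_{l > x_*}|\Gamma_l| \approx x_*\cdot N/2 \qquad (57)$$

We thus get

$$x_*/2 < L < D < 2x_* \qquad (58)$$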



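We can now insert the respective parameters, $\alpha$ and $\beta$, in our expressions for $x_*$, thus yielding

$$x_*^{\alpha=1,\beta=2}/2 < x_*/2 < L < D < 2x_* < 2\,x_*^{\alpha=1/2,\beta=1} \qquad (59)$$

With our numerical expressions for the upper and lower bounds we get

$$\frac{1}{4pN}\ln\left(\left[\left(\frac{2pN+2}{pN+2}\right)^{1/(pN)} - 1\right]pN^2 + 1\right) < L < \frac{2}{pN}\ln\left(\left[\left(\frac{2pN+1}{pN+1}\right)^{2/(pN)} - 1\right]\frac{pN^2}{4} + 1\right) \qquad (60)$$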



For very small ($\varepsilon > 1$ such that $pN^2 \to 0$) or vanishing $p$ the inequality reduces to $0.16\,N < L < 3.19\,N$, with $L = 0.25\,N$ for the true $L$ in the case $p = 0$. For $p = cN^{-1-\varepsilon}$, $c$ of order one and $\varepsilon \in [0, 1]$, and a reasonable number of shortcuts ($pN^2 > 1$), $L$ displays a behavior already exemplified for $x_*$:

$$L \sim N^{\varepsilon}\left((1 - \varepsilon)\ln N + \ln c + C_1(c)\right) \qquad (61)$$

with $C_1$ constant for $\varepsilon \neq 0$. For very strongly connected graphs ($\varepsilon \in [-1, 0)$), these bounds become invalid, as $L$ tends to one and the differential equations will no longer approximate the difference equations well enough.
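
The limiting constants quoted for vanishing $p$ can be reproduced by evaluating the two bounds directly. This is a small numerical sketch with our own function names; the choice $N = 1000$, $p = 10^{-9}$ (so that $pN^2 = 10^{-3}$) is only an example of the regime with almost no shortcuts.

```python
import math

def x_star(p, N, alpha, beta):
    """x_* for given alpha, beta; log1p keeps the small-p limit accurate."""
    pN = p * N
    power = math.exp(math.log1p(pN / (pN + beta)) / (alpha * pN))
    return math.log1p((power - 1) * p * alpha**2 * N**2) / (2 * alpha * pN)

def L_bounds(p, N):
    """Lower bound x_*(alpha=1, beta=2)/2 and upper bound 2*x_*(alpha=1/2, beta=1)."""
    return 0.5 * x_star(p, N, 1.0, 2.0), 2.0 * x_star(p, N, 0.5, 1.0)

N = 1000
lo, hi = L_bounds(1e-9, N)
# Both ratios approach (e**0.5 - 1)/4 and (e**2 - 1)/2 as p*N**2 -> 0.
print(lo / N, hi / N)
```

The true value $0.25\,N$ for $p = 0$ lies between the two printed ratios times $N$.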

We want to come back to the question of the importance of the non-linear 
quadratic terms in our evolution equations © and One may be led to the 
wrong conclusion that they are always marginal because initially g is very small 
compared to the huge $f$. But one should note that $g$ grows very fast for certain choices of the parameter $p$. To get some feeling we take, for example, $g$ and ask for what values of $x$ the term $g^2/N$ is of order $g$. This is the case if $g \sim N$. The result strongly depends on the value of $p$. Inserting $p = c/N$ in the equation for $g$ and solving for $x$ we get $x \approx (2c)^{-1}\cdot\ln N$. Hence the quadratic contribution becomes appreciable when $x$ approaches the regime where $f$ drops to $N/2$, that is, the regime we are interested in. Put differently, it is dangerous to neglect this term on a priori grounds.
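
A quick numerical check of this estimate, as a sketch under our own assumptions: we use the closed-form upper-bound solution for $g$ (the case $\alpha = 1$, $g(0) = 1$) and the arbitrary choice $c = 2$. For $p = c/N$ the function $g$ is already of order $N$ at $x = \ln N/(2c)$, whereas for $p = c/N^2$ it never exceeds $\max(1, c)$.

```python
import math

def g_upper(x, p, N):
    """Closed-form upper-bound solution g(x) for alpha = 1 with g(0) = 1 (assumed form)."""
    K = p * N**2
    E = math.exp(2 * p * N * x)
    return K * E / (K - 1 + E)

N, c = 10**6, 2.0
x_cross = math.log(N) / (2 * c)

ratio_dense = g_upper(x_cross, c / N, N) / N      # of order one for p = c/N
g_sparse = g_upper(10 * N, c / N**2, N)           # stays of order c for p = c/N^2
print(ratio_dense, g_sparse)
```

In the first case $g^2/N$ is comparable to $g$ at the scale of interest, in the second it stays smaller by a factor of order $c/N$.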

On the other hand, taking for example $p = c/N^2$ and making the same calculation we infer that the non-linear term remains negligible in the domain we are interested in. For $0 < \varepsilon < 1$ we get of course intermediate results.

The effects of neglecting the even smaller combinatorial terms which appear in the full eqns (18) and (19) can be seen in figure 2. Here the neglect has a greater effect for the case $p = 1/N^2$ than for $p = 1/N$, as the total number $g$ of gaps in the first case is smaller and therefore nearer to 1 than in the second case. Nevertheless, these discrepancies are still negligible.



6 Comparison with former results 

Barthelemy and Amaral, and later also Newman and Watts, conjectured a scaling behavior for $L$ of the form

$$L = N\cdot F(pN^2) \qquad (62)$$

with some universal function $F$ with $F(y) \to \tfrac14$ for $y \to 0$ and $F(y) \to C\ln(y)/y$ for $y \to \infty$. Our bounds do not scale exactly in this way, but at least do so approximately for large $N$ and $\varepsilon > 0$. In this regime the $pN$-dependent term in the logarithm tends to the constant $\exp(1/\alpha\beta)$ and our bounds assume the form

$$N\cdot F(pN^2) \quad\text{with}\quad \frac{1}{4y}\ln\left[(e^{1/2} - 1)\,y + 1\right] < F(y) < \frac{2}{y}\ln\left[\frac{e^2 - 1}{4}\,y + 1\right] \qquad (63)$$

On the other hand, for large $pN$ and $\varepsilon < 0$ or $\varepsilon = 0$ with large $c$, this scaling behavior breaks down and our estimate for the average distance scales like $(\ln N)/pN$. This estimate makes however only sense for $\ln N > pN$, as $L > 1$. Thus the case $\varepsilon < 0$ isn't described correctly by this formula. For $\varepsilon = 0$, $L$, according to this scaling ansatz, simply scales like $(\ln N)/c$, without any correction term of the form $(\ln c)/c$, which occurs in our above presumably more exact result. So, although $L$ correctly scales with $\ln N$ in both cases (depicting a random graph), there exist certain deviations for large $c$.

In ^], Newman, Moore and Watts found the following expression for their 
universal scaling function 



$$F_{NMW}(y) = \frac{1}{2\sqrt{y^2 + 2y}}\,\tanh^{-1}\sqrt{\frac{y}{y+2}} \qquad (64)$$

For $p \propto N^{-2}$ ($\varepsilon = 1$) this function is a constant. On the other hand, for $p = cN^{-1-\varepsilon}$ with $0 < \varepsilon < 1$, the argument $y = pN^2$ is large for large $N$ and

$$L = N\,F_{NMW}(pN^2) \approx \frac{\ln pN^2 + \ln 2}{4pN} = \frac{N^{\varepsilon}\left((1 - \varepsilon)\ln N + \ln c + \ln 2\right)}{4c} \qquad (65)$$

In figure 3 $F_{NMW}$ and the scaling functions deriving from our bounds are shown, indicating that the NMW-ansatz complies with them. Note however that this is only valid for $\varepsilon > 0$ or $\varepsilon = 0$ with a not too large $c$ in $p = c/N$.
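
The statement that the NMW scaling function lies between the scaling functions derived from our bounds can be checked on a grid of arguments. The sketch below uses our own function names and the explicit forms of the two bounding functions for $\alpha = 1, \beta = 2$ and $\alpha = \tfrac12, \beta = 1$; the grid of $y$ values is arbitrary.

```python
import math

def F_lower(y):
    """Scaling function of the lower bound (alpha = 1, beta = 2)."""
    return math.log((math.exp(0.5) - 1) * y + 1) / (4 * y)

def F_upper(y):
    """Scaling function of the upper bound (alpha = 1/2, beta = 1)."""
    return 2 * math.log((math.exp(2) - 1) / 4 * y + 1) / y

def F_nmw(y):
    """Newman-Moore-Watts scaling function in the variable y = p * N**2."""
    return math.atanh(math.sqrt(y / (y + 2))) / (2 * math.sqrt(y * y + 2 * y))

for y in (0.01, 0.1, 1.0, 10.0, 100.0, 1e4):
    print(y, F_lower(y), F_nmw(y), F_upper(y))
```

For small $y$ the middle column tends to $1/4$, and for large $y$ it behaves like $\ln(2y)/(4y)$, which is the form used for the large-$N$ estimate of $L$.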

Barbour and Reinert made a rigorous analysis of the probability distribution for the distance function, getting the following result: Let $X$ and $X'$ be some randomly chosen vertices on $G$, $\rho = pN$ and $S := pN^2$ (to be identified with $L\rho$ in [14]) then

$$P\left[d(X,X') > \frac{1}{\rho}\left(\tfrac12\ln S + x\right)\right] = \int_0^\infty \frac{e^{-y}}{1 + y\,e^{2x}}\,dy + \text{error term} \qquad (66)$$

for all $x$ whose modulus is bounded by a fixed fraction of $\ln S$. $L$ is the mean distance resulting from this distribution for $d(X,X')$. To make things simpler, we instead treat the median of it, which can be easily approximated by the special choice $x = 0$. Neglecting the error term we obtain

$$P\left[d(X,X') > \frac{\ln S}{2\rho}\right] \approx 0.596 \qquad (67)$$

stating $L \approx L_{median} \approx \ln(pN^2)/2pN$ or, with $p = cN^{-1-\varepsilon}$,

$$L \approx \frac{N^{\varepsilon}\left((1 - \varepsilon)\ln N + \ln c\right)}{2c} \qquad (68)$$

The corresponding universal scaling function reads $F_{BR} = (\ln y)/2y$, and is also depicted in figure 3. For $pN^2 > 2$ (at least one expected shortcut) this lies within our bounds. Below this, the error term of the probability $P$ rises above one and $F_{BR}$ loses its meaning.
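
Both numerical statements of this paragraph are easy to reproduce. The following sketch assumes that the leading term of the tail probability is the integral $\int_0^\infty e^{-y}/(1 + y e^{2x})\,dy$ (error term dropped), evaluates it at $x = 0$ with a simple Simpson rule, and compares $F_{BR}$ with the two bounding scaling functions. The function names and the integration cut-off are our choices.

```python
import math

def tail_probability(x, upper=60.0, n=60000):
    """Simpson approximation of the integral over [0, upper]; the tail beyond is negligible."""
    h = upper / n                     # n must be even
    scale = math.exp(2.0 * x)
    def integrand(y):
        return math.exp(-y) / (1.0 + y * scale)
    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3

def F_br(y):
    return math.log(y) / (2 * y)

def F_lower(y):
    return math.log((math.exp(0.5) - 1) * y + 1) / (4 * y)

def F_upper(y):
    return 2 * math.log((math.exp(2) - 1) / 4 * y + 1) / y

print(tail_probability(0.0))
for y in (2.0, 10.0, 1e3):
    print(y, F_lower(y) < F_br(y) < F_upper(y))
```

At $y = 1$ the function $F_{BR}$ vanishes and therefore falls below the lower bounding function, in line with the remark that it loses its meaning for small $pN^2$.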



7 Dimension of the Small World 

In [24], two related dimensional concepts (of a fractal type) were introduced for infinite graphs (note the close connection to the distance degree sequence, discussed in [10]) and a number of their properties proved. We learned later that this concept
occurred already earlier in the literature but, as far as we can see, its interesting 
properties were never systematically studied (see for example US]). A technically 
different but physically related concept was exploited by Dhar (UHl)) see also |27| . 
Furthermore Ising models on such irregular spaces were studied recently, an early source being [28]. The reason to deal with infinite graphs is that only in the limit $N \to \infty$ the global notion of dimension becomes independent of local (model dependent) aspects like e.g. coordination numbers of, say, lattices, all having the same embedding dimension. One of the two definitions reads:

Definition 7.1 Let $G$ be an arbitrary graph with $N$ vertices and $U_l(x)$ the $l$-neighborhood of the vertex $x \in G$. Then we define the dimension of $G$ (relative to $x$) as

$$\dim_x(G) := \lim_{l\to\infty}\frac{\ln \#U_l(x)}{\ln l} \qquad (69)$$

(provided the limit exists; in general we have to deal with lim inf and lim sup).

It was shown that this notion of dimension (also called the "internal scaling dimension") is independent of the initial vertex $x$ (under a mild technical assumption). For finite (but large) and connected graphs we can instead employ the following graph characteristic:

$$\dim_{approx}(G) = \ln N / \ln \operatorname{diam}(G) \qquad (70)$$

For real networks this value has been analysed in [29]. There one can see that the approximate dimension might tend to underestimate other notions of dimension, because $\ln \#U_l(x)$ saturates for large $l$ (an effect which is however pretty obvious as the spreading argument does of course no longer hold in that regime). On the other hand, in our case we expect $\ln \#U_l(x)$ not to saturate before reaching $N/2 =: \#U_{x_*}(x)$ (cf. figure 2). As $x_* < \operatorname{diam} G < 2x_*$ (see however the discussion after eqn (54)), we get

$$\frac{\ln N - \ln 2}{\ln x_* + \ln 2} \le \dim_{approx} \le \frac{\ln N}{\ln x_*} \qquad (71)$$

Thus, for large enough $N$ and $x_*$ the approximate dimension doesn't deviate too much when staying below $N/2$ and won't suffer from saturating effects.
Applying this concept to the Small World Model for large $N$ we get

$$\dim_{approx}(G) = \frac{\ln N}{\ln \operatorname{diam}(G)} \approx \frac{\ln N}{\ln\left(C_1 N^{\varepsilon}\,(C_2 + \ln N^{1-\varepsilon})\right)} \qquad (72)$$

which is $\approx 1/\varepsilon$ for $\varepsilon > 0$, the constants $C_1$ and $C_2$ being independent of $N$ (depending only on $c$, $\alpha$ and $\beta$). For $\varepsilon = 1$ ($p \sim N^{-2}$) we get the value one, which is reasonable for this rarefied linear case. For the opposite case, $\varepsilon = 0$ ($p \sim N^{-1}$), we have

$$\dim_{approx}(G) \approx \ln N / \ln\ln N \qquad (73)$$

which diverges for $N \to \infty$.
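
Numerical values for the approximate dimension can be obtained by inserting the bounds on the diameter. In the sketch below, which uses our own function names, the diameter is assumed to lie between the lower-bound $x_*$ ($\alpha = 1, \beta = 2$) and twice the upper-bound $x_*$ ($\alpha = \tfrac12, \beta = 1$); the parameter values are illustrative.

```python
import math

def x_star(N, c, eps, alpha, beta):
    """x_* for p = c * N**(-1 - eps), evaluated in a numerically stable way."""
    pN = c * N ** (-eps)
    power = math.exp(math.log1p(pN / (pN + beta)) / (alpha * pN))
    return math.log1p((power - 1) * c * alpha**2 * N ** (1 - eps)) / (2 * alpha * pN)

def dim_approx_range(N, c, eps):
    """Range of ln N / ln diam(G) implied by x_*(1, 2) < diam(G) < 2 * x_*(1/2, 1)."""
    x_low = x_star(N, c, eps, 1.0, 2.0)
    x_up = x_star(N, c, eps, 0.5, 1.0)
    return math.log(N) / math.log(2 * x_up), math.log(N) / math.log(x_low)

for N in (1e6, 1e12):
    print(N, dim_approx_range(N, 1.0, 1.0), dim_approx_range(N, 1.0, 0.5))
```

For $\varepsilon = 1$ the range brackets the value one. For $\varepsilon = \tfrac12$ the values stay below $1/\varepsilon = 2$ at these sizes and approach it only slowly, because of the logarithmic factor inside the logarithm of the diameter.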

Newman et al. introduced a renormalization process for the Small World Model. This process divides the graph into segments of length 2 and interprets these segments again as vertices in a new Small World Model with size $N' = N/2$ and edge-probability $p' = 4p$. With $p = cN^{-1-\varepsilon}$ this substitution yields $p' = c' N'^{-1-\varepsilon}$ with $c' = 2^{1-\varepsilon}c$. Hence our $\dim_{approx} = 1/\varepsilon$ is constant under this renormalization. A similar phenomenon was observed in the renormalization process
for infinite graphs (with globally bounded node degree) introduced in [^. 
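
The invariance of the exponent under this renormalization step is a short piece of arithmetic, which the following sketch (with hypothetical helper names and arbitrary parameter values) makes explicit: after the substitution $N' = N/2$, $p' = 4p$ the new probability is again of the form $c' N'^{-1-\varepsilon}$ with the same $\varepsilon$.

```python
def renormalize(N, p):
    """One renormalization step as described above: N' = N/2, p' = 4p."""
    return N / 2, 4 * p

def scaling_constant(N, p, eps):
    """The constant c for which p = c * N**(-1 - eps)."""
    return p * N ** (1 + eps)

N, c, eps = 1024.0, 3.0, 0.3
p = c * N ** (-1 - eps)
N2, p2 = renormalize(N, p)
c2 = scaling_constant(N2, p2, eps)
print(c2, 2 ** (1 - eps) * c)   # both expressions for c' agree
```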



8 Conclusion 

We found the mean distance $L$ of a Small World Model with $N \gg 1$ nodes and edge-probability $p = cN^{-1-\varepsilon}$ to be bounded from above and below by two expressions of the form $C_2 N^{\varepsilon}(\ln N^{1-\varepsilon} + C_1)$ (the constants depending on whether the upper or lower bound is taken). This implies a broad transition zone, in which the mean distance drops from a linear growth to a logarithmic one, permitting each power law $L \sim N^{\varepsilon}$ for $\varepsilon \in (0, 1)$. Furthermore, $1/\varepsilon$ can be regarded as an approximate dimension of the corresponding graph. Our results partly corroborate earlier work but lead also to certain numerical deviations.

Figure Captions



Caption Figure 1: The small-world model for $k = 1$. The number of nodes is $N = 30$. In this particular realisation we have inserted four additional shortcuts. The unfilled nodes are the vertices which can be reached by at most 3 steps starting from node $X$; the step-number will be denoted by $n$ in the following. The black nodes are the vertices not reached after three steps, their cardinality being denoted by $f(n)$ (free nodes). This set consists of three connected subsets which are separated by the subsets of nodes already reached. The number of segments of free nodes is denoted by $g(n)$ (gaps).

Caption Figure 2: The function f{n) for two examples. Left side: N = 10"*, p = 
IQ-'^, e = |; right side: N = 10^, p = 10"^ e = i (with p normalized to iV-(l + e) 
). The upper diagrams show 20 realisations, their mean value (averaged over fixed $n$) and the analytical bounds of (45). The lower diagrams show these bounds and the averaged curve again, together with the numerical solutions of the full difference eqns (18), (19) and of the simplified ones. In both cases the averaged curve exceeds the solution of the difference eqns (notice however the perhaps surprisingly large standard deviation), but doesn't top the upper analytical bounds. The solutions of the full eqns (18), (19) only differ notably from the simplified ones in the left, nearly linear case; the neglect of the additional combinatorial terms was hence justified. Yet, what can't be seen in these diagrams is that, in contrast to the full ones, the solution of the simplified eqns does not sink to zero, and even crosses the upper boundary for large $n$. This is however neither surprising nor important, as our boundary eqns are only guaranteed to hold in the interval $f(n) \in [N/2, N]$; thus we omit an additional logarithmic diagram.

Caption Figure 3: Universal scaling functions